Video encoding and decoding



FIELD OF THE INVENTION 

The invention relates to a video encoder and a method of encoding images in a
first resolution mode with reference to a reference image having said first resolution. The
invention also relates to a corresponding video decoder and a method of decoding such
images.

BACKGROUND OF THE INVENTION 

Predictive video encoders and decoders as defined in the opening paragraph are generally
known in the field of video compression. For example, the MPEG video compression standard
specifies P-pictures as images which are encoded with reference to a previous image of the
sequence. The previous image may be an I-picture, i.e. an image which is autonomously
encoded without reference to other images of the sequence, or another P-picture. The
previous image is stored in a memory.

The MPEG standard also specifies B-pictures as images which are encoded with reference to
a previous image as well as a subsequent image. B-pictures are encoded more efficiently
than P-pictures. However, the encoding of B-pictures requires the encoder to have twice the
memory capacity and substantially twice the memory bandwidth. Similar considerations apply
to the corresponding decoder.

Designing an MPEG encoder is thus a matter of balancing circuit complexity and memory
capacity (i.e. chip area) against compression efficiency. In view thereof, Philips
introduced an integrated circuit on the market which allows I- and P-coding only. The
circuit produces IPPP sequences of images having a resolution of 720x576 pixels, usually
referred to as '601' or 'D1' resolution.

OBJECT AND SUMMARY OF THE INVENTION

It is an object of the invention to provide a more flexible video encoder and decoder.

To this end, the video encoder in accordance with the invention is characterized in that
the video encoder comprises control means for selectably encoding said images in a second,
lower resolution mode with reference to two reference images having said second resolution,
and for storing said two reference images with the second resolution in said memory. It is
thereby achieved that the same video encoder can produce B-pictures in a lower resolution
mode with the same resources, in particular memory. The lower resolution is preferably half
of the first resolution, e.g. 352x576 pixels, usually referred to as '½D1' resolution.

Video encoders usually include a motion estimation circuit, which applies a predetermined
search strategy in the first resolution mode to search for motion vectors representing
motion between an input image and the reference image. In an embodiment of the invention,
said motion estimation circuit applies said search strategy in the second resolution mode
to both reference images. This embodiment is based on the recognition that the time which
is available for searching motion vectors in the first resolution mode allows such motion
vectors to be searched twice in the lower resolution mode (at the same frame rate). In an
MPEG encoder, in which B-pictures refer to a previous image as well as a subsequent image,
the motion estimation circuit is thus used to search both the forward and backward motion
vectors in the lower resolution mode.

A further embodiment of the video encoder is based on the recognition that twice the
amount of time is available for encoding P-pictures (i.e. pictures encoded with reference
to a single reference frame) compared with the encoding of B-pictures. In accordance
therewith, the motion estimation circuit is arranged to apply the search strategy in a
first pass to search motion vectors with a first precision, and to apply said search
strategy in a second pass to refine the precision of the motion vectors found in the first
pass. It is thereby achieved that the motion vectors associated with P-pictures are more
precise than the motion vectors associated with B-pictures. This is particularly attractive
because P-pictures are generally further apart from each other than B-pictures.

BRIEF DESCRIPTION OF THE FIGURES 

Fig. 1 shows a schematic diagram of a video encoder in accordance with the invention.

Figs. 2 and 3 show diagrams to illustrate the operation of the video encoder.

Figs. 4A-4C show images to illustrate a two-pass motion vector search process
carried out by a motion estimation and compensation circuit, which is shown in Fig. 1.

DESCRIPTION OF EMBODIMENTS

The invention will now be described with reference to an MPEG encoder for
producing IPPP sequences in D1 resolution and IBBP sequences in ½D1 resolution. That is,
the encoder produces I and P-pictures in the D1 resolution mode, and I, B and P-pictures in
the ½D1 resolution mode. However, the invention is not restricted to video encoders or
decoders complying with the MPEG standard. The essential aspect is that images are
predictively encoded with reference to one reference image in one resolution mode and
predictively encoded with reference to two reference images in a lower resolution mode.

Fig. 1 shows a schematic diagram of an MPEG video encoder in accordance 
with the invention. The general layout is known per se in the art. The encoder comprises a 
subtracter 1, an orthogonal transform (e.g. DCT) circuit 2, a quantizer 3, a variable-length 
encoder 4, an inverse quantizer 5, an inverse transform circuit 6, an adder 7, a memory unit 8, 
and a motion estimation and compensation circuit 9. 

The memory unit 8 includes a memory 81 having a capacity for storing a
reference image having a high resolution of, for example, 720x576 pixels (usually referred to
as D1). The same memory can store two reference images having substantially half said
resolution, i.e. 360x576 pixels (usually referred to as ½D1). This is symbolically shown in
the Figure by two memory parts having reference numerals 81a and 81b. The memory unit
further includes user-operable switches 82a and 82b for selectably switching the encoder into
the high-resolution encoding mode or the low-resolution mode.
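
The capacity argument can be checked with a short calculation. The following is a minimal sketch only: the 4:2:0 chroma format with 8-bit samples (1.5 bytes per pixel) and the helper name `frame_bytes` are assumptions made for illustration and are not taken from the description.

```python
# Sketch: memory needed for one D1 reference image versus two 1/2-D1 images.
# Assumption (not stated in the description): 4:2:0 sampling, 8 bits per sample,
# i.e. 1 luma byte plus 0.5 chroma byte per pixel.
BYTES_PER_PIXEL = 1.5


def frame_bytes(width: int, height: int) -> int:
    """Return the storage needed for one image of the given size, in bytes."""
    return int(width * height * BYTES_PER_PIXEL)


d1 = frame_bytes(720, 576)       # one high-resolution reference image
half_d1 = frame_bytes(360, 576)  # one low-resolution reference image

# Two low-resolution reference images occupy the same space as one D1 image.
print(d1, 2 * half_d1)
```

Under these assumptions, memory parts 81a and 81b together occupy exactly the space of memory 81.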

In the high-resolution encoding mode, images having D1 resolution are
written into and read from memory 81 with the switches 82a and 82b in the position denoted
H. Because only one image at this resolution can be stored at a time, the MPEG encoder can
only produce I-pictures or P-pictures. As is generally known in the art of video coding,
I-pictures are autonomously encoded images without reference to a previously encoded
image. The subtracter 1 is inactive. The I-pictures are locally decoded and stored in memory
81. P-pictures are predictively encoded with reference to a previous I or P-picture. The
subtracter 1 is active. The subtracter 1 subtracts a motion-compensated prediction image Xp
from the input image Xi, so that the difference is encoded and transmitted. The adder 7 adds
the locally decoded image to the prediction image so as to update the stored reference image.

In the low-resolution mode, images having ½D1 resolution are written into
and read from memories 81a and 81b with the switches 82a and 82b in the position denoted
L. In this encoding mode, two further switches 83 and 84 are operated. Switch 83 controls
which one of the memories is read by the motion estimator; switch 84 controls in which one
of the memories the locally decoded image is stored. Note that the switches in memory unit 8
are implemented as software-controlled memory-addressing operations in practical
embodiments of the encoder.

In the low-resolution mode, the encoder operates as follows. I-pictures are
again encoded with subtracter 1 being inoperative. The locally decoded I-picture is written
into memory 81a (switch 84 in position a). The first P-picture is predictively encoded with
reference to the stored I-picture (switch 83 in position a), and its locally decoded version is
written into memory 81b (switch 84 in position b). Subsequent P-pictures are alternately read
from and written into the memories 81a and 81b, so that memory unit 8 holds the last two I or
P-pictures at any time. This allows bi-directional predictive coding of images (B-pictures) in
the low-resolution mode.
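
The alternating use of memories 81a and 81b can be modelled in a few lines. The sketch below is illustrative only: the class `ReferenceStore` and its method names are hypothetical, and pictures are represented by plain labels rather than pixel data.

```python
class ReferenceStore:
    """Hypothetical model of memory parts 81a and 81b in the low-resolution mode."""

    def __init__(self):
        self.parts = {"a": None, "b": None}
        self.write_pos = "a"  # plays the role of switch 84

    def store(self, picture):
        """Store a locally decoded I- or P-picture, then toggle the write position."""
        written = self.write_pos
        self.parts[written] = picture
        self.write_pos = "b" if written == "a" else "a"
        return written

    def read_pos_for_p(self):
        """Position of switch 83 for P-coding: the part that was written most recently."""
        return "b" if self.write_pos == "a" else "a"


store = ReferenceStore()
store.store("I1")                # written into part a
print(store.read_pos_for_p())    # the first P-picture is predicted from part a
store.store("P4")                # written into part b
print(store.parts)               # both anchors are now available for B-coding
```

Because each new I or P-picture overwrites the older of the two stored images, the two most recent anchor pictures are always available.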

B-pictures are encoded with reference to a previous and a subsequent I or 
P-picture. Note that this requires the encoding order of images to be different from the 
display order. Circuitry therefor is known in the art and not shown in the Figure. The motion 
estimation and compensation circuit 9 now accesses both memories 81a and 81b to generate 
forward motion vectors (referring to the previous image) and backward motion vectors 
(referring to the subsequent image). To this end, the switch 83 switches between position a 
and position b. Adder 7 is inoperative during B-encoding. 
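
The reordering from display order to encoding order can be sketched as follows. This is a simplified illustration rather than the circuitry referred to above; the function name `coding_order` and the frame labels are assumptions.

```python
def coding_order(display):
    """Reorder frame labels so that each B-picture follows both of its anchor pictures.

    `display` is a list of labels in display order, e.g. ["I1", "B2", "B3", "P4"].
    """
    out, pending_b = [], []
    for frame in display:
        if frame.startswith("B"):
            pending_b.append(frame)  # hold back until the subsequent anchor is coded
        else:
            out.append(frame)        # the I or P anchor is coded first
            out.extend(pending_b)    # then the B-pictures that precede it in display
            pending_b.clear()
    # Trailing B-pictures without a later anchor are simply appended in this sketch.
    return out + pending_b


print(coding_order(["I1", "B2", "B3", "P4", "B5", "B6", "P7"]))
```

For the IBBPBBP sequence, this yields the order I1, P4, B2, B3, P7, B5, B6, so that both reference images are in memory before each B-picture is encoded.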

Fig. 2 shows a timing diagram to summarize the operation of the encoder. The
diagram shows the positions of switches 83 and 84 during consecutive frame periods for
encoding an IBBPBBP sequence. The frames are identified by encoding type (I, B, P) and
display order: I1 is the first frame, B2 is the second frame, B3 is the third frame, P4 is the
fourth frame, etc. Switching between the two memories in the B-encoding mode is shown on a
frame-by-frame basis for simplicity. In practice, the switching is done at the macroblock
level.

The motion estimation circuit executes a given motion vector search process.
Said process requires reading the respective memory a given number of times, say N,
per frame in the low-resolution mode. The same process requires 2N memory accesses per
frame in the high-resolution mode. As Fig. 2 clarifies, encoding of B-pictures requires 2N
memory accesses per frame period in the low-resolution mode, i.e. N for each of the two
reference images. Accordingly, the memory bandwidth requirements are substantially the
same in the high-resolution mode and the low-resolution mode. The feature of B-encoding in
the low-resolution mode thus does not require additional hardware or software resources.
This is a significant advantage of the invention.
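
The access counts stated above can be tabulated with a small helper. This is a sketch under the assumptions of the text (N accesses per reference image in the low-resolution mode, 2N in the high-resolution mode); the function name `search_accesses` and the mode labels are hypothetical.

```python
def search_accesses(mode: str, picture: str, n: int) -> int:
    """Memory accesses per frame period used by the motion vector search.

    mode is "high" or "low"; picture is "I", "P" or "B"; n is the number of
    accesses the search needs for one reference image in the low-resolution mode.
    """
    if mode == "high" and picture == "B":
        raise ValueError("B-pictures are only produced in the low-resolution mode")
    if picture == "I":
        return 0  # I-pictures are encoded without motion estimation
    per_reference = 2 * n if mode == "high" else n
    references = 2 if picture == "B" else 1
    return per_reference * references


n = 1000  # arbitrary illustrative value
print(search_accesses("high", "P", n))  # 2N
print(search_accesses("low", "B", n))   # 2N: N for each of the two references
print(search_accesses("low", "P", n))   # N: half of the available budget is unused
```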

Fig. 2 further reveals that the vector search process requires N memory
accesses per frame in the P-encoding mode, whereas 2N accesses are available. This
recognition is exploited in a further aspect of the invention. To this end, the motion vector
search process is carried out in two passes for P-pictures. In the first pass, the motion vectors
are found with a 'standard' precision. In the second pass, the search process is continued to
further refine the accuracy of the motion vectors that were found in the first pass. The
two-pass operation is illustrated in Fig. 3, the refining pass being denoted by a' or b', as the
case may be. Note again that the two-pass operation is carried out in practice on a
macroblock-by-macroblock basis.

Figs. 4A-4C show parts of an image to further illustrate the two-pass motion
estimation process. Fig. 4A shows a current image 400 to be predictively encoded. The image
is divided into macroblocks. A current macroblock to be encoded includes an object 401.
Reference numerals 41, 42, 43 and 44 denote motion vectors already found during encoding
of the neighboring macroblocks. Figs. 4B and 4C show the previous I or P-picture 402 stored
in one of the memories 81a or 81b, as the case may be. In the previous reference image, the
object (now denoted 403) is at a different position and has a slightly different shape. In this
example, the motion estimator searches for the best motion vector from among a number of
candidate motion vectors. Various strategies for selecting suitable candidate motion vectors
are known in the art. It is here assumed that the motion vectors denoted 41, 42, 43 and 44 in
Fig. 4A are among the candidate motion vectors for the current macroblock. Fig. 4B shows
the result of the first pass of the motion vector search process. It appears that candidate
motion vector 43 provides the best match between the current macroblock of the input image
and an equally sized block 404 of the reference image.

In the second pass, the motion vector search is applied with different candidate
vectors. More particularly, the motion vector found in the first pass is one candidate motion
vector. Other candidate vectors are further refinements thereof. This is illustrated in Fig. 4C,
where 43 is the motion vector found in the first pass and eight dots 45 represent end points of
new candidate motion vectors. They differ from motion vector 43 by one (or one-half) pixel.
The search algorithm is now carried out with the new candidate vectors. It appears in this
example that block 405 best resembles the current macroblock. Accordingly, motion vector
46 is the motion vector which is used for producing the motion-compensated prediction
image Xp. The two-pass operation for P-pictures is particularly attractive because it provides
more accurate motion vectors for images that are further apart than B-pictures.
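
The two-pass candidate search can be sketched in code. The following is a minimal illustration, not the encoder's actual search strategy: it assumes a sum-of-absolute-differences (SAD) matching criterion, integer-pixel refinement only (the half-pixel case is omitted), images given as lists of rows of luma values, and at least one candidate that stays inside the reference image. All function names are hypothetical.

```python
def sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between the current block and a displaced reference block."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def best_vector(cur, ref, bx, by, size, candidates):
    """Return the candidate (dx, dy) with the lowest SAD, ignoring out-of-image candidates."""
    height, width = len(ref), len(ref[0])
    valid = [(dx, dy) for dx, dy in candidates
             if 0 <= bx + dx and bx + dx + size <= width
             and 0 <= by + dy and by + dy + size <= height]
    return min(valid, key=lambda v: sad(cur, ref, bx, by, v[0], v[1], size))


def two_pass_search(cur, ref, bx, by, size, candidates):
    """First pass: pick the best of the given candidates (e.g. neighbouring vectors).

    Second pass: re-run the search on the first-pass vector and its eight
    neighbours at a distance of one pixel.
    """
    first = best_vector(cur, ref, bx, by, size, candidates)
    refined = [(first[0] + dx, first[1] + dy)
               for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
    return first, best_vector(cur, ref, bx, by, size, refined)
```

The first returned vector corresponds to vector 43 in Fig. 4B and the second to vector 46 in Fig. 4C. Because the second pass reuses the same matching routine with a new candidate list, it needs no search hardware beyond that of the first pass.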

It is to be noted that the two-pass motion vector search can also be applied to
B-pictures in a yet lower resolution mode (SIF, 352x288 pixels). The inventive idea of using
available memory and motion estimation circuitry for enhancing the image quality or
reducing the bit rate at lower resolutions can also be applied to other resources of the video
encoder. For example, the 'overcapacity' of transform circuits 2 and 6, quantizers 3 and 5, and
variable-length encoder 4 in Fig. 1 allows two-pass encoding in which the first pass is used as
a step of analyzing image complexity, and the second pass is used for actual coding.

It is further noted that the invention is also applicable in multi-resolution video
decoders. Since a decoder corresponds to the local decoding loop of the encoder as described
above, a separate description thereof is not necessary.

The invention can be summarized as follows. A video encoder is usually
designed to have a given performance at a given resolution. For example, MPEG2 encoders
are known that compress video at '601' resolution (720x576 pixels) into IPPP sequences
using 2 MB of RAM. The invention provides the feature of selectably (82a, 82b) encoding
images in a lower resolution mode. The spare capacity of resources in the low-resolution
mode (e.g. memory capacity and memory bandwidth) is used to improve the performance
(e.g. higher image quality, lower bit rate). More particularly, the RAM (81) and motion
estimator (9) required for producing P-pictures in the high-resolution mode are arranged (83,
84) to produce B-pictures in the low-resolution mode.



